XML for Beginners 
1. XML: the Snake Oil of the Internet age?
2. Basic XML Concepts
3. Defining XML Data Formats
4. Querying XML Data

Snake Oil?

• Snake oil is the all-curing drug that those strange guys in
  wild-west movies sell, travelling from town to town but
  visiting each town only once.

• Google: "snake oil" xml
  => some 2000 hits

• "XML revolutionizes software development"
• "XML is the all-healing, world-peace inducing tool for
  computer processing"
• "XML enables application portability"
• "Forget the Web, XML is the new way to business"
• "XML is the cure for your data exchange, information
  integration, [x-2-y], [you name it] problems"
• "XML, the Mother of all Web Application Enablers"
• "XML has been the best invention since sliced bread"

XML is not...

• A replacement for HTML
  (but HTML can be generated from XML)
• A presentation format
  (but XML can be converted into one)
• A programming language
  (but it can be used with almost any language)
• A network transfer protocol
  (but XML may be transferred over a network)
• A database
  (but XML may be stored in a database)


April 29th, 2003 Organizing and Searching Information with XML 3 


But then, what is it?

XML is a meta markup language
for text documents / textual data.

XML allows you to define languages
("applications") to represent text
documents / textual data.

XML by Example

<article>
  <author>Gerhard Weikum</author>
  <title>The Web in 10 Years</title>
</article>

• Easy to understand for human users
• Very expressive (semantics along with the data)
• Well structured, easy to read and write from programs

This looks nice, but...

XML by Example

... this is XML, too:

<t108>
  <x87>Gerhard Weikum</x87>
  <g10>The Web in 10 Years</g10>
</t108>

• Hard to understand for human users
• Not expressive (no semantics along with the data)
• Well structured, easy to read and write from programs

XML by Example

... and what about this XML document:

<data>
  ch37fhgks73 j5mv9d63h5mgfkds8d984lgnsmcns 983
</data>

• Impossible to understand for human users
• Not expressive (no semantics along with the data)
• Unstructured, read and write only with special programs

The actual benefit of using XML depends heavily
on the design of the application.

Possible Advantages of Using XML

• Truly portable data
• Easily readable by human users
• Very expressive (semantics near data)
• Very flexible and customizable (no finite tag set)
• Easy to use from programs (libraries available)
• Easy to convert into other representations
  (XML transformation languages)
• Many additional standards and tools
• Widely used and supported

App. Scenario 1

[Figure: Clients; Converters (XML2WML, XML2PDF); Database with XML documents]

App. Scenario 2: Data Exchange

[Figure: Buyer; Su...; XML exchange formats (BMECat, ebXML, RosettaNet, BizTalk, ...); Legacy System (e.g., Cobol)]

App. Scenario 3: XML for Metadata


<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www-dbs/Sch03.pdf">
    <dc:title>A Framework for...</dc:title>
    <dc:creator>Ralf Schenkel</dc:creator>
    <dc:description>While there are...</dc:description>
    <dc:publisher>Saarland University</dc:publisher>
    <dc:subject>XML Indexing</dc:subject>
    <dc:rights>Copyright ...</dc:rights>
    <dc:type>Electronic Document</dc:type>
    <dc:format>text/pdf</dc:format>
    <dc:language>en</dc:language>
  </rdf:Description>
</rdf:RDF>


[Screenshot: the PDF document (Sch03.pdf) described by this metadata, opened in Adobe Acrobat]

App. Scenario 4: Document Markup

<article>
  <section id="1" title="Intro">
    This article is about <index>XML</index>.
  </section>
  <section id="2" title="Main Results">
    <name>Weikum</name> <cite idref="Weik01"/> shows
    the following theorem (see Section <ref idref="1"/>)
    <theorem id="theo:1" source="Weik01">
      For any XML document x, ...
    </theorem>
  </section>
  <literature>
    <cite id="Weik01"><author>Weikum</author></cite>
  </literature>
</article>

App. Scenario 4: Document Markup

• Document markup adds structural and semantic
  information to documents, e.g.
  - Sections, Subsections, Theorems, ...
  - Cross References
  - Literature Citations
  - Index Entries
  - Named Entities

• This allows queries like
  - Which articles cite Weikum's XML paper from 2001?
  - Which articles talk about (the named entity) "Weikum"?

XML for Beginners

Part 2: Basic XML Concepts

2.1 XML Standards by the W3C
2.2 XML Documents
2.3 Namespaces




2.1 XML Standards: an Overview

• XML Core Working Group:
  - XML 1.0 (Feb 1998), 1.1 (candidate for recommendation)
  - XML Namespaces (Jan 1999)
  - XML Inclusion (candidate for recommendation)
• XSLT Working Group:
  - XSL Transformations 1.0 (Nov 1999), 2.0 planned
  - XPath 1.0 (Nov 1999), 2.0 planned
  - eXtensible Stylesheet Language XSL(-FO) 1.0 (Oct 2001)
• XML Linking Working Group:
  - XLink 1.0 (Jun 2001)
  - XPointer 1.0 (March 2003, 3 substandards)
• XQuery 1.0 (Nov 2002) plus many substandards
• XML Schema 1.0 (May 2001)

2.2 XML Documents

What's in an XML document?
• Elements
• Attributes
• plus some other details
  (see the lecture if you want to know more)

A Simple XML Document

<article>
  <author>Gerhard Weikum</author>
  <title>The Web in Ten Years</title>
  <text>
    <abstract>In order to evolve...</abstract>
    <section number="1" title="Introduction">
      The <index>Web</index> provides the universal...
    </section>
  </text>
</article>

A Simple XML Document

<article>
  <author>Gerhard Weikum</author>
  <title>The Web in Ten Years</title>
</article>

[Callout: Element]

Elements in XML Documents

• (Freely definable) tags: article, title, author
  - with start tag: <article> etc.
  - and end tag: </article> etc.
• Elements: <article> ... </article>
• Elements have a name (article) and a content (...)
• Elements may be nested.
• Elements may be empty: <this_is_empty/>
• Element content is typically parsed character data (PCDATA),
  i.e., strings with special characters, and/or nested elements (mixed
  content if both).
• Each XML document has exactly one root element and forms a
  tree.
• Elements with a common parent are ordered.

Elements vs. Attributes

Elements may have attributes (in the start tag) that have a name and
a value, e.g. <section number="1">.

What is the difference between elements and attributes?

• Only one attribute with a given name per element (but an arbitrary
  number of subelements)
• Attributes have no structure, they are simply strings (while elements
  can have subelements)

As a rule of thumb:
• Content into elements
• Metadata into attributes

Example:
<person born="1912-06-23" died="1954-06-07">
Alan Turing</person> proved that...

XML Documents as Ordered Trees

article
├── author
│   └── "Gerhard Weikum"
├── title
│   └── "The Web in 10 years"
└── text
    ├── abstract
    │   └── "In order ..."
    └── section   (attributes: number="1", title=...)
        ├── "The"
        ├── index
        │   └── "Web"
        └── "provides..."
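
The ordered-tree view can be tried out with a short sketch. It uses Python's standard xml.etree.ElementTree module, which is an assumption of this example and not part of the slides. The helper name outline and the inline sample document are illustrative only.

```python
import xml.etree.ElementTree as ET

# Sample document modelled on the slides (inline copy for illustration).
DOC = """<article>
  <author>Gerhard Weikum</author>
  <title>The Web in 10 years</title>
  <text>
    <abstract>In order ...</abstract>
    <section number="1" title="Introduction">The <index>Web</index> provides...</section>
  </text>
</article>"""

def outline(elem, depth=0):
    """Return one indented line per element, in document order."""
    attrs = "".join(f' {name}="{value}"' for name, value in elem.attrib.items())
    lines = ["  " * depth + elem.tag + attrs]
    for child in elem:  # iterating an element yields its children in document order
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(DOC)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Attributes are printed next to their element rather than as child nodes, mirroring the rule that attributes belong to the start tag and carry no structure of their own.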

More on XML Syntax

• Some special characters must be escaped using entities:
    <  ->  &lt;
    &  ->  &amp;
  (will be converted back when reading the XML doc)
• Some other characters may be escaped, too:
    >  ->  &gt;
    "  ->  &quot;
    '  ->  &apos;
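
A minimal sketch of escaping and unescaping, assuming Python's standard xml.sax.saxutils helpers (the sample strings are made up for illustration):

```python
from xml.sax.saxutils import escape, unescape

raw = "if a < b && b > c"

# escape() always replaces &, < and >.
escaped = escape(raw)
print(escaped)

# Quotes are only replaced when an extra mapping is supplied.
quoted = escape('say "hi"', {'"': "&quot;"})
print(quoted)

# Reading the document converts the entities back.
restored = unescape(escaped)
print(restored)
```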

Well-Formed XML Documents

A well-formed document must adhere to, among others, the
following rules:

• Every start tag has a matching end tag.
• Elements may nest, but must not overlap.
• There must be exactly one root element.
• Attribute values must be quoted.
• An element may not have two attributes with the same
  name.
• Comments and processing instructions may not appear
  inside tags.
• No unescaped < or & signs may occur inside character
  data.
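
Several of these rules can be checked with any XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module. The helper name is_well_formed and the sample snippets are illustrative and not taken from the slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "ok": '<a><b x="1"/></a>',
    "missing end tag": "<a><b></a>",
    "overlapping elements": "<a><b></a></b>",
    "two root elements": "<a/><b/>",
    "unquoted attribute value": "<a x=1/>",
    "duplicate attribute": '<a x="1" x="2"/>',
    "unescaped ampersand": "<a>R&D</a>",
}

results = {name: is_well_formed(doc) for name, doc in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'NOT well-formed'}")
```

Only the first sample passes. Each of the others violates one of the rules listed above.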

2.3 Namespaces

<library>
  <description>Library of the CS Department</description>
  <book bid="HandMS2000">
    <title>Principles of Data Mining</title>
    <description>
      Short introduction to <em>data mining</em>, useful
      for the IRDM course
    </description>
  </book>
</library>

• Semantics of the description element is ambiguous
• Content may be defined differently
• Renaming may be impossible (standards!)

=> Disambiguation of separate XML applications using
   unique prefixes

Namespace Syntax

<dbs:book xmlns:dbs="http://www-dbs/dbs">

Prefix as abbreviation
of URI

Namespace Example

<dbs:book xmlns:dbs="http://www-dbs/dbs">
  <dbs:description> ... </dbs:description>
  <dbs:text>
    <dbs:formula>
      <mathml:math
          xmlns:mathml="http://www.w3.org/1998/Math/MathML">
        ...
      </mathml:math>
    </dbs:formula>
  </dbs:text>
</dbs:book>
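
To see that the prefix is only an abbreviation and the URI is what identifies a namespace, the sketch below parses a copy of the example with Python's standard xml.etree.ElementTree (an assumption of this example). The prefixes d and m used in the lookup are chosen freely and deliberately differ from those in the document.

```python
import xml.etree.ElementTree as ET

DOC = """<dbs:book xmlns:dbs="http://www-dbs/dbs">
  <dbs:description>...</dbs:description>
  <dbs:text>
    <dbs:formula>
      <mathml:math xmlns:mathml="http://www.w3.org/1998/Math/MathML"/>
    </dbs:formula>
  </dbs:text>
</dbs:book>"""

root = ET.fromstring(DOC)

# ElementTree stores names as {namespace-URI}local-name.
print(root.tag)

# The query prefixes are local to this lookup; only the URIs must match.
NS = {"d": "http://www-dbs/dbs", "m": "http://www.w3.org/1998/Math/MathML"}
math = root.find("d:text/d:formula/m:math", NS)
print(math.tag)

# A default namespace yields the same qualified name without any prefix.
same_book = ET.fromstring('<book xmlns="http://www-dbs/dbs"><description/></book>')
print(same_book.tag == root.tag)
```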

Default Namespace

• A default namespace may be set for an element and its
  content (but not its attributes):

  <book xmlns="http://www-dbs/dbs">
    <description>...</description>
  </book>

• It can be overridden in nested elements by specifying the
  namespace there (using a prefix or another default namespace)

XML for Beginners

Part 3: Defining XML Data Formats

3.1 Document Type Definitions
3.2 XML Schema (very short)

3.1 Document Type Definitions

Sometimes XML is too flexible:

• Most programs can only process a subset of all possible
  XML applications
• For exchanging data, the format (i.e., elements,
  attributes and their semantics) must be fixed

=> Document Type Definitions (DTDs) for establishing the
   vocabulary for one XML application (in some sense
   comparable to schemas in databases)

A document is valid with respect to a DTD if it conforms
to the rules specified in that DTD.

Most XML parsers can be configured to validate.




DTD Example: Elements

<!ELEMENT article    (title, author+, text)>
<!ELEMENT title      (#PCDATA)>
<!ELEMENT author     (#PCDATA)>
<!ELEMENT text       (abstract, section*, literature?)>
<!ELEMENT abstract   (#PCDATA)>
<!ELEMENT section    ...>
<!ELEMENT literature (#PCDATA)>
<!ELEMENT index      (#PCDATA)>

[Callout on section*: the content of the text element may
contain zero or more section elements in this position]

Element Declarations in DTDs

One element declaration for each element type:

<!ELEMENT element_name content_specification>

where content_specification can be
• (#PCDATA)      parsed character data
• (child)        one child element
• (c1,...,cn)    a sequence of child elements c1...cn
• (c1|...|cn)    one of the elements c1...cn

For each component c, possible counts can be specified:
  - c    exactly one such element
  - c+   one or more
  - c*   zero or more
  - c?   zero or one

Plus arbitrary combinations using parentheses:
<!ELEMENT f ((a|b)*,c+,(d|e))*>
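
Content specifications are essentially regular expressions over child element names. The following sketch makes that concrete by translating an element-only content model into a Python regular expression. It is not how a validating parser works internally. The function names are hypothetical, and #PCDATA, EMPTY and ANY are deliberately not handled.

```python
import re

def content_model_to_regex(model):
    """Translate an element-only DTD content model into a compiled regex.

    Each element name becomes a group matching "name "; the sequence
    operator ',' becomes plain concatenation; |, *, + and ? keep their meaning.
    """
    model = re.sub(r"\s+", "", model)
    pattern = re.sub(
        r"[A-Za-z_][\w.\-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + " )",
        model,
    )
    return re.compile(pattern.replace(",", ""))

def children_match(model, children):
    """Check whether a list of child element names satisfies the model."""
    sequence = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(sequence) is not None

print(children_match("(title, author+, text)", ["title", "author", "author", "text"]))
print(children_match("(title, author+, text)", ["title", "text"]))
print(children_match("((a|b)*,c+,(d|e))*", ["a", "b", "c", "d", "c", "e"]))
```

Appending a space to every name keeps a name such as a from matching the beginning of a longer name such as ab.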

More on Element Declarations

• Elements with mixed content:
  <!ELEMENT text (#PCDATA|index|cite|glossary)*>

• Elements with empty content:
  <!ELEMENT image EMPTY>

• Elements with arbitrary content (not suitable for
  production-level DTDs):
  <!ELEMENT thesis ANY>

Attribute Declarations in DTDs

Attributes are declared per element:

<!ATTLIST section number CDATA #REQUIRED
                  title  CDATA #REQUIRED>

declares two required attributes for element section.

[Callout on section: element name]

Attribute Declarations in DTDs

Attributes are declared per element:

<!ATTLIST section number CDATA #REQUIRED
                  title  CDATA #REQUIRED>

declares two required attributes for element section.

Possible attribute defaults:
• #REQUIRED        is required in each element instance
• #IMPLIED         is optional
• #FIXED default   always has this default value
• default          has this default value if the attribute is
                   omitted from the element instance

Attribute Types in DTDs

• CDATA         string data
• (A1|...|An)   enumeration of all possible values of the
                attribute (each is an XML name token)
• ID            unique XML name to identify the element
• IDREF         refers to the ID attribute of some other element
                ("intra-document link")
• IDREFS        list of IDREF values, separated by white space
• plus some more

Attribute Examples

<!ATTLIST publication type (journal|inproceedings) #REQUIRED
                      pubid ID #REQUIRED>
<!ATTLIST cite cid IDREF #REQUIRED>
<!ATTLIST citation ref IDREF #IMPLIED
                   cid ID #REQUIRED>

<publications>
  <publication type="journal" pubid="Weikum01">
    <author>Gerhard Weikum</author>
    <text>In the Web of 2010, XML <cite cid="c12"/>...</text>
    <citation cid="c12" ref="XML98"/>
    <citation cid="c15">...</citation>
  </publication>
  <publication type="inproceedings" pubid="XML98">
    <text>XML, the Extensible Markup Language, ...</text>
  </publication>
</publications>
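
The ID/IDREF links in this example can be followed by hand, as the sketch below shows. It assumes Python's standard xml.etree.ElementTree, which does not read DTDs, so the table of ID attributes (ID_ATTRS) is written out manually from the ATTLIST declarations. The ID values in the inline copy are illustrative; ID values must be XML names and therefore cannot start with a digit.

```python
import xml.etree.ElementTree as ET

DOC = """<publications>
  <publication type="journal" pubid="Weikum01">
    <author>Gerhard Weikum</author>
    <text>In the Web of 2010, XML <cite cid="c12"/>...</text>
    <citation cid="c12" ref="XML98"/>
    <citation cid="c15">...</citation>
  </publication>
  <publication type="inproceedings" pubid="XML98">
    <text>XML, the Extensible Markup Language, ...</text>
  </publication>
</publications>"""

# Which attribute is declared with type ID for which element (from the ATTLISTs).
ID_ATTRS = {"publication": "pubid", "citation": "cid"}

root = ET.fromstring(DOC)

# Build the ID index and enforce document-wide uniqueness of IDs.
by_id = {}
for elem in root.iter():
    attr = ID_ATTRS.get(elem.tag)
    if attr and attr in elem.attrib:
        value = elem.get(attr)
        if value in by_id:
            raise ValueError(f"duplicate ID {value!r}")
        by_id[value] = elem

# Follow cite/@cid -> citation, then citation/@ref -> publication.
cite = root.find(".//cite")
citation = by_id[cite.get("cid")]
cited_publication = by_id[citation.get("ref")]
print(citation.tag, "->", cited_publication.get("pubid"))
```

Nothing in the declarations restricts an IDREF to a particular target element type, so the lookup has to accept whatever element carries the matching ID.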

Linking DTD and XML Docs

• Document Type Declaration in the XML document:
  <!DOCTYPE article SYSTEM "http://www-dbs/article.dtd">

Linking DTD and XML Docs

• Internal DTD:

<?xml version="1.0"?>
<!DOCTYPE article [
  <!ELEMENT article (title, author+, text)>
  ...
  <!ELEMENT index (#PCDATA)>
]>
<article>
  ...
</article>

• Both ways can be mixed; the internal DTD overrides
  external entity information:

<!DOCTYPE article SYSTEM "article.dtd" [
  <!ENTITY % pub_content "(title+,author*,text)">
]>

Flaws of DTDs

• No support for basic data types like integers, doubles,
  dates, times, ...
• No structured, self-definable data types
• No type derivation
• id/idref links are quite loose (target is not specified)

=> XML Schema

3.2 XML Schema Basics

• XML Schema is an XML application
• Provides simple types (string, integer, dateTime,
  duration, language, ...)
• Allows defining possible values for elements
• Allows defining types derived from existing types
• Allows defining complex types
• Allows posing constraints on the occurrence of elements
• Allows enforcing uniqueness and foreign keys
• Way too complex to cover in an introductory talk

Simplified XML Schema Example

<xs:schema>
  <xs:element name="article">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="text">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="abstract" type="xs:string"/>
              <xs:element name="section" type="xs:string"
                          minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

XML for Beginners

Part 4: Querying XML Data

4.1 XPath
4.2 XQuery

Querying XML with XPath and XQuery

XPath and XQuery are query languages for XML data, both
standardized by the W3C and supported by various database products.
Their search capabilities include
• logical conditions over element and attribute content
  (first-order predicate logic a la SQL; simple conditions only in XPath)
• regular expressions for pattern matching of element names
  along paths or subtrees within XML data
• joins, grouping, aggregation, transformation, etc. (XQuery only)

In contrast to database query languages like SQL, an XML query
does not necessarily (need to) know a fixed structural schema
for the underlying data.

A query result is a set of qualifying nodes, paths, subtrees,
or subgraphs from the underlying data graph,
or a set of XML documents constructed from this raw result.

4.1 XPath

• XPath is a simple language to identify parts of the XML
  document (for further processing)
• XPath operates on the tree representation of the
  document
• The result of an XPath expression is a set of elements or
  attributes
• Here we discuss the abbreviated version of XPath

Elements of XPath

• An XPath expression usually is a location path that
  consists of location steps, separated by /:
  /article/text/abstract: selects all abstract elements
• A leading / always means that the path starts at the root
  of the document
• Each location step is evaluated in the context of a node
  in the tree, the so-called context node
• Possible location steps:
  - Child element x: select all child elements with name x
  - Attribute @x: select all attributes with name x
  - Wildcards * (any child), @* (any attribute)
  - Multiple matches, separated by |: x|y|z

Combining Location Steps

• Standard: / (context node is the result of the preceding
  location step)
  article/text/abstract (all the abstract nodes of articles)
• Select any descendant, not only children: //
  article//index (any index element in articles)
• Select the parent element: ..
• Select the context node: .

The latter two are important when using predicates.




Predicates in Location Steps

• Added with [] to the location step
• Used to restrict the elements that qualify as the result of a
  location step to those that fulfil the predicate:
  - a[b]     elements a that have a subelement b
  - a[@d]    elements a that have an attribute d
  - Plus conditions on content/value:
    - a[b="c"]
    - a[@d>7]

XPath by Example

/literature/book/author   retrieves all book authors:
  starting with the root, traverses the tree, matches element
  names literature, book, author, and returns elements
    <author>Suciu, Dan</author>,
    <author>Abiteboul, Serge</author>, ...,
    <author><firstname>Jeff</firstname>
            <lastname>Ullman</lastname></author>

/literature/(book|article)/author   authors of books or articles
/literature/*/author                authors of books, articles, essays, etc.
/literature//author                 authors that are descendants of literature
/literature//@year                  value of the year attribute of descendants of literature
/literature//author[firstname]      authors that have a subelement firstname
/literature/book[price < "50"]      low priced books
/literature/book[author//country = "Germany"]   books with German author
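
Some of these paths can be tried with Python's standard xml.etree.ElementTree, which implements only a subset of XPath. That choice is an assumption of this sketch, and a full XPath engine would accept the expressions as written on the slide. Paths are evaluated relative to the root element, so /literature/book/author becomes ./book/author. The sample data, including years and prices, is invented for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<literature>
  <book year="1999">
    <author>Suciu, Dan</author>
    <author>Abiteboul, Serge</author>
    <price>45</price>
  </book>
  <book year="1988">
    <author><firstname>Jeff</firstname><lastname>Ullman</lastname></author>
    <price>80</price>
  </book>
  <article year="1998">
    <author>Suciu, Dan</author>
  </article>
</literature>"""

lit = ET.fromstring(DOC)  # lit is the <literature> root element

book_authors = lit.findall("./book/author")        # /literature/book/author
any_authors = lit.findall("./*/author")            # /literature/*/author
all_authors = lit.findall(".//author")             # /literature//author
with_first = lit.findall(".//author[firstname]")   # /literature//author[firstname]

# Attribute steps and numeric comparisons are outside ElementTree's subset,
# so these two are finished in plain Python.
years = [e.get("year") for e in lit.findall(".//*[@year]")]   # /literature//@year
cheap_books = [b for b in lit.findall("./book")
               if float(b.findtext("price")) < 50]            # book[price < 50]

print(len(book_authors), len(any_authors), len(all_authors), len(with_first))
print(years, [b.get("year") for b in cheap_books])
```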

4.2 Core Concepts of XQuery

XQuery is an extremely powerful query language for XML data.
A query has the form of a so-called FLWR expression:

FOR $var1 IN expr1, $var2 IN expr2, ...
LET $var3 := expr3, $var4 := expr4, ...
WHERE condition
RETURN result-doc-construction

The FOR clause evaluates expressions (which may be XPath-style
path expressions) and binds the resulting elements to variables.
For a given binding each variable denotes exactly one element.

The LET clause binds entire sequences of elements to variables.

The WHERE clause evaluates a logical condition with each of
the possible variable bindings and selects those bindings that
satisfy the condition.

The RETURN clause constructs, from each of the variable bindings,
an XML result tree. This may involve grouping and aggregation
and even complete subqueries.

XQuery Examples

// find Web-related articles by Dan Suciu from the year 1998
<results> {
  FOR $a IN document("literature.xml")//article
  FOR $n IN $a//author, $t IN $a/title
  WHERE $a/@year = "1998"
    AND contains($n, "Suciu") AND contains($t, "Web")
  RETURN <result> $n $t </result> } </results>

// find articles co-authored by authors who have jointly written a book after 1995
<results> {
  FOR $a IN document("literature.xml")//article
  FOR $a1 IN $a//author, $a2 IN $a//author
  WHERE SOME $b IN document("literature.xml")//book SATISFIES
        $b//author = $a1 AND $b//author = $a2 AND $b/@year > "1995"
  RETURN <result> $a1 $a2 <wrote> $a </wrote> </result> }
</results>
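
For readers without an XQuery processor at hand, the first query can be mimicked in Python to show how the clauses interact. This is only an analogy under stated assumptions: it uses the standard xml.etree.ElementTree module, an invented inline document in place of literature.xml, and nested loops standing in for FOR, an if for WHERE, and element construction for RETURN.

```python
import copy
import xml.etree.ElementTree as ET

# Invented stand-in for literature.xml.
DOC = """<literature>
  <article year="1998">
    <author>Dan Suciu</author>
    <title>Querying the Web</title>
  </article>
  <article year="1998">
    <author>Serge Abiteboul</author>
    <title>Web Data</title>
  </article>
  <article year="2001">
    <author>Dan Suciu</author>
    <title>XML and the Web</title>
  </article>
</literature>"""

def text_of(elem):
    """String value of an element: all text in its subtree."""
    return "".join(elem.itertext())

def clone(elem):
    """Copy an element for the result tree, dropping trailing whitespace."""
    duplicate = copy.deepcopy(elem)
    duplicate.tail = None
    return duplicate

lit = ET.fromstring(DOC)
results = ET.Element("results")

for a in lit.iter("article"):              # FOR $a IN ...//article
    for n in a.iter("author"):             # FOR $n IN $a//author,
        for t in a.findall("title"):       #     $t IN $a/title
            if (a.get("year") == "1998"    # WHERE ...
                    and "Suciu" in text_of(n)
                    and "Web" in text_of(t)):
                result = ET.SubElement(results, "result")   # RETURN <result> ...
                result.append(clone(n))
                result.append(clone(t))

print(ET.tostring(results, encoding="unicode"))
```

Each combination of $a, $n and $t is one variable binding. The condition filters the bindings, and one result element is built per surviving binding.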

Summary and Outlook

You should give one, I won't.